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Abstract. The Asian box turtles of the Cuora galbinifrons complex (C. galbinifrons, C. bourreti, and C. picturata) rank 
among the most endangered chelonian species in the world. Despite several previous studies, the phylogenetic relation- 
ships and species boundaries of this complex remain a matter for dispute due to a shortage of field-collected samples for 
genetic validation, observed discordance between mitochondrial and nuclear markers, and reported intergradation zones 
combined with a strong tendency of hybridization in the genus. Here, we re-investigate the relationships and potential hy- 
bridization between the species of the C. galbinifrons complex based on the most comprehensive dataset to date consisting 
of 394 morphologically identified specimens (136 C. galbinifrons, 200 C. bourreti, 49 C. picturata, and nine individuals al- 
legedly from Hainan). The turtles mainly came from assurance colonies as well as from zoological and private collections 
across the USA and Europe. Bayesian and Maximum Likelihood analyses of a concatenated mitochondrial dataset (COI 
and ND4 plus adjacent tRNA genes) yielded almost identical topologies, supporting three major clades corresponding to 
C. galbinifrons, C. bourreti, and C. picturata, respectively. In accordance with previous studies, C. bourreti represented the 
sister clade to C. galbinifrons and these two clades together were sister to C. picturata. Haplotype networks revealed pro- 
nounced mitochondrial divergences between the taxa. STRUCTURE and PCA analyses using 12 microsatellite loci also 
confirmed three distinct clusters that are in agreement with the recognized species. Only a few specimens with admixed 
ancestry (hybrids) or mismatched mitochondrial identity were revealed, suggesting extremely limited gene flow among 
the three species. However, this pattern could also reflect the separate captive management of the individual taxa and an 
underrepresentation of geographic contact zones in our sampling. 


Key words. Testudines, Geoemydidae, biogeography, conservation, genetics, endangered species, microsatellites, phylo- 
genetics, Principal Component Analyses, Southeast Asia, STRUCTURE. 


Introduction 


Asian box turtles of the genus Cuora GRAY, 1856 
(Geoemydidae) inhabit the tropical forests of Southeast 
and East Asia. Decades of unsustainable harvest for hu- 
man consumption, traditional medicine, and internation- 
al trade along with habitat destruction and other anthro- 
pogenic pressures have driven most Cuora species to the 
brink of extinction (FIEBIG & LEHR 2000, LAU & SHI 2000, 
LEHR 1997, PARHAM et al. 2001, TCC 2018, VAN DIJK 2000). 


Originally, Cuora galbinifrons BOURRET, 1939 was regard- 
ed as a monotypic species which was only known from a 
handful of individuals (BOURRET 1939, LI 1958, PETZOLD 
1963, 1965). Specimens from Hainan were originally de- 
scribed by Li (1958) as a new subspecies of C. flavomargi- 
nata (GRAY, 1863), before this subspecies was assigned to 
C. galbinifrons and either regarded as the distinct subspe- 
cies C. g. hainanensis (Li, 1958) (e.g., ERNST & BARBOUR 
1989, IVERSON 1986, ZHAO 1986) or synonymized with 
C. galbinifrons (e.g., IVERSON & McCorp 1992, LEHR et al. 
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1998, OBST & REIMANN 1994, ZONG & PAN 1989). Around 
1980, an increasing number of specimens started to ap- 
pear in the international pet trade (BUSKIRK 1988, PAULER 
1980) as the corollary of a non-sustainable harvest for the 
Chinese food market (LEHR 1996, 1997), before local pop- 
ulations collapsed and international protection measures 
were implemented (Lr et al. 2020, MCCORMACK & STUART 
2020, MCCORMACK et al. 2020). As a by-product of this 
overexploitation, several new subspecies of C. galbinifrons 
were described based on material from the pet trade. Two 
of them are now regarded as full species (STUART & PAR- 
HAM 2004, TTWG 2021), C. bourreti OBST & REIMANN, 
1994 and C. picturata LEHR et al., 1998, whereas C. g. serra- 
ta IVERSON & McCorp, 1992 is invalid because this taxon 
was based on species hybrids (PARHAM et al. 2001). 

Using mtDNA sequences (ND4 and COI) of 26 individ- 
uals (9 C. galbinifrons, 5 “C. g. serrata’, 7 C. bourreti, and 5 
C. picturata) STUART & PARHAM (2004) demonstrated the 
C. galbinifrons complex to contain three major monophy- 
letic clades corresponding to C. galbinifrons, C. bourreti, 
and C. picturata. The authors identified all three as inde- 
pendent evolutionary lineages that should be treated as full 
species. Later SPINKS et al. (2012) confirmed these results 
when examining five representatives of the C. galbinifrons 
complex (1 C. bourreti, 2 C. galbinifrons, 2 C. picturata) in 
the frame of a genus-level phylogeny. Based on phyloge- 
netic analyses of the mitochondrial ND1 gene and 16 nu- 
clear loci, they found all three species to be monophylet- 
ic and highly divergent from one another. In both studies, 
C. bourreti represented the sister taxon of C. galbinifrons, 
and these two taxa together were sister to C. picturata. 
However, nuclear networks yielded evidence for gene flow 
and/or incomplete lineage sorting, particularly between 
C. bourreti and C. galbinifrons (SPINKS et al. 2012). In con- 
trast, a recently published global phylogeny of turtles using 
15 nuclear loci (THOMSON et al. 2021) suggested C. bour- 
reti as sister taxon of a clade comprised of C. galbinifrons 
and C. picturata, but the tree presented in the supporting 
information matched the topology reported in other stud- 
ies, suggesting misidentification. In another study, 24 box 
turtles (10 C. galbinifrons, 9 C. bourreti, 5 C. picturata) from 
a Chinese turtle breeding farm were examined (Liu et al. 
2019). These authors compared morphological features, as 
well as sequences for the complete mitochondrial genome, 
the mitochondrial COI gene, and one nuclear marker 
(Rag-1). Their results were inconclusive. Nevertheless, they 
concluded that C. picturata represents a distinct species 
whereas subspecies status should be assigned to C. galbini- 
fronsand C. bourreti. Using osteological evidence, conspec- 
ificity between C. bourreti and C. galbinifrons had already 
been suggested earlier (FRITZ et al. 2006), based on inter- 
mediate character states in putative hybrid individuals. 

Presently, the C. galbinifrons complex is generally ac- 
cepted to contain three species (TTWG 2021). The Indo- 
chinese box turtle (C. galbinifrons) is native to northern 
Vietnam, adjacent northeastern Laos, the Guangxi prov- 
ince of China, and Hainan Island (China). Bourret’s box 
turtle (C. bourreti) is restricted to central Vietnam and 


adjacent eastern Laos (STUART et al. 2011). The Southern 
Vietnamese box turtle (C. picturata) is confined to south- 
ern central Vietnam (Ly et al. 2011, TTWG 2021). 

All three species still face intense collection pressure 
with an estimated population loss of over 90% during the 
last 60 years (MCCORMACK et al. 2020, MCCORMACK & 
STUART 2020, LI et al. 2020). These taxa are classified as 
Critically Endangered by the International Union for Con- 
servation of Nature since 2002 (LI et al. 2020) and have 
been elevated to Appendix I by the Convention on In- 
ternational Trade in Endangered Species in 2022 (CITES 
2023). Cuora galbinifrons, C. bourreti, and C. picturata rank 
among the 25 most endangered chelonian species in the 
world (TCC 2018). 

Despite the obvious need for effective conservation 
planning and management for these three species, there is 
still uncertainty and ongoing discussions regarding their 
validity. The paucity of field-collected samples for genetic 
investigations and the frequent hybridization reported be- 
tween recognized Cuora taxa contribute to the confusion 
regarding phylogenetic relationships and species bounda- 
ries within the C. galbinifrons complex (SPINKS et al. 2012). 
Legislative restrictions (e.g., the ‘Convention on Biologi- 
cal Diversity’ [CBD] and since 2014, the additional ‘Nagoya 
Protocol on Access and Benefit-sharing’) render obtaining 
sufficient numbers of genetic samples for research increas- 
ingly difficult (e.g., NEUMANN et al. 2018, PRATHAPAN et al. 
2018 and references therein). Hence, the taxonomy of the 
C. galbinifrons complex is still not sufficiently resolved, de- 
spite several previous efforts (Liu et al. 2019, SPINKs et al. 
2012, STUART & PARHAM 2004, THOMSON et al. 2021, see 
also FRITZ et al. 2006). 

In the present study, we re-investigate relationships and 
potential hybridization within the C. galbinifrons complex 
based on a comprehensive dataset of 394 individuals. For 
doing so, we combine sequences of two mtDNA fragments 
(COI and ND4 plus adjacent tRNA genes) with informa- 
tion from 12 nuclear microsatellite loci. We supplement 
our dataset with 36 previously published mtDNA sequenc- 
es available from GenBank or provided directly by the au- 
thors (Liu et al. 2019) to allow comparison with previous 
studies. 


Material and methods 
Sampling 


We compiled 394 genetic samples housed in the tissue col- 
lection of the Museum of Zoology, Senckenberg Dresden 
(MTD T), among them 136 morphologically identified 
C. galbinifrons, 200 C. bourreti, 49 C. picturata, and nine 
pet individuals allegedly from Hainan. Among these sam- 
ples are a few for which the complete specimen is preserved 
in the herpetological collection (MTD D) of the same insti- 
tute. For these specimens, we refer below to the voucher in 
the herpetological collection. 

Due to the paucity of field-collected material, our sam- 
ples were mainly obtained from captive assurance colonies 
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across the USA (123 samples) and Europe (234 samples, 68 
of which with known origin) and supplemented with 36 
samples from museum specimens (Fig. 1, Supplementary 
Table S1), all exported to the USA or Europe prior to the 
12 October 2014 when the ‘Nagoya Protocol on Access and 
Benefit-sharing’ took effect. Our samples also include nine 
turtles from the international pet trade, supposedly from 
the putative intergradation zone of C. galbinifrons and 
C. bourreti on Hainan Island. Also, one field-collected sam- 
ple from a turtle morphologically identified as C. bourreti 
from Quang Binh Province, northern central Vietnam, was 
studied (MTD T 357). This site lies in the putative main- 
land hybrid zone of C. galbinifrons and C. bourreti (FRITZ 
et al. 2002, LEHR et al. 1998). Tissue (scute clippings) or a 
small amount of blood (~100 ul) was collected by veteri- 
narians or trained keepers from the jugular vein or the sub- 
carapacial sinus using a1 ml syringe and 22-25 g needle. 
Whole blood was placed in Lithium heparin microtainer 
tubes, while scale clippings were preserved in 90% ethanol. 
All samples were refrigerated until shipping to the molecu- 
lar genetic laboratory of the Museum of Zoology, Sencken- 
berg Dresden. 


Laboratory procedures 


Total genomic DNA of fresh tissue and blood samples was 
extracted using the innuPREP DNA Mini Kit and the in- 
nuPREP Blood DNA Mini Kit (Analytik Jena AG, Jena, 
Germany). Two mitochondrial gene fragments (ND4 
including flanking tRNA genes and the barcoding gene 
COI), commonly used for phylogenetic and phylogeo- 
graphic purposes in geoemydid turtles (e.g., BARTH et 
al. 2004, FRITZ et al. 2008, Spinks et al. 2004, STUART & 
PARHAM 2004) were selected. Both mtDNA fragments 
were amplified using a PCR volume of 25 ul containing 
1.25 units Taq polymerase (Bioron, Ludwigshafen, Ger- 
many) with PCR buffer 10x including MgCl „ 0.2 mM 
of each dNTP (Thermo Scientific, St. Leon-Rot, Germa- 
ny), 0.4 mM of each primer, 0.02 ug/ul bovine serum al- 
bumine (BSA; Thermo Fisher Scientific Inc., Waltham, 
MA, USA), ultrapure H O, and 10-40 ng of total genom- 
ic DNA. To amplify the ND4 fragment, the primer pair 
L-ND4 and H-Leu (STUART & PARHAM 2004) was used, 
while for COI the primers L-turtCOIc and H-turtCOlIc 
(STUART & PARHAM 2004) were used. The thermocycling 
conditions were: initial denaturation at 94°C for 5 min, 
35 cycles with denaturation at 94°C for 45 s, annealing for 
45 s at 57°C and extension at 72°C for 80 s, with a final 
extension step at 72°C for 10 min. PCR products were pu- 
rified using the ExoSAP-IT enzymatic clean-up (USB Eu- 
rope GmbH, Staufen, Germany; 1:20 dilution, modified 
protocol: 30 min at 37°C; 15 min at 80°C) and sequenced 
on an ABI 3730 Genetic Analyzer (Applied Biosystems, 
Foster City, CA, USA) using the BigDye Terminator v3.1 
Cycle Sequencing Kit (Applied Biosystems) and either the 
primers L-ND4int and H-Leuint for the ND4 fragment or 
L-turtCOlIc, and H-turtCOlc for the COI fragment (Stu- 
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ART & PARHAM 2004). Cycle sequencing reactions were 
run with an initial denaturation at 96°C for 60 s, and 25 
cycles with denaturation at 96°C for 10 s, annealing at 
50°C for 5 s, and elongation at 60°C for 76 s. For purifica- 
tion, 400 ul Sephadex (GE Healthcare, München, Germa- 
ny; 1:20 dilution) per well in a Performa DTR V3 96-Well 
Short Plate was used (Edge Biosystems, Gaithersburg, 
MD, USA). COI sequences could be obtained for 345 tur- 
tles (fragment length 741 bp). Sequencing of ND4 (814 bp, 
including 152 bp of flanking tRNA genes) was successful 
for 328 samples (Supplementary Table S1). All sequences 
were checked using the original electropherograms and 
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Figure 1. Distribution of the Cuora galbinifrons complex. Puta- 
tive ranges according to TTWG (2021). Numbers refer to the 
following sample sites: 1: Son La Province, Vietnam; 2: Nghé An 
Province, Vietnam; 3: Pu Mat Nature Reserve, Vietnam; 4: Kham- 
mouane Province, Lao People’s Democratic Republic; 5: Quang 
Binh Province, Vietnam; 6: Quang Binh Province, Vietnam 
(field-collected sample); 7: Quang Tri Province, Vietnam; 8: Thira 
Thién Hué Province, Vietnam; 9: Quang Nam Province, Vietnam; 
10: Quang Ngai Province, Vietnam; 11: Hainan, People’s Republic 
of China. Except for the three FMNH samples, coloration is in 
accordance with microsatellite cluster assignment. Exposed land- 
masses during the Last Glacial Maximum are displayed in light 
grey. Uncertain and unreliable localities are shown as stars. Inset: 
C. bourreti, Turtle Conservation Centre (TCC), Cuc Phuong Na- 
tional Park, Vietnam. 
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subsequently manually aligned in BIOEDIT 7.0.5.2 (HALL 
1999). European Nucleotide Archive (ENA) accession 
numbers and sampling sites are summarized in Supple- 
mentary Table S1. 

In addition, 16 nuclear microsatellite loci previously 
used for taxon delimitation in the genus Cuora (TIEDE- 
MANN et al. 2014) were amplified using multiplex PCRs 
(Supplementary Table S2). Each PCR was conducted in a 
final volume of 10 pl containing 0.5 units Taq polymerase 
(Bioron) together with the buffer 10x recommended by the 
supplier, 2.0 mM MgCl (Bioron), 0.2 mM of each dNTP 
(Thermo Scientific), 0.02 ug/ul of BSA (Thermo Scientif- 
ic), 20-40 ng of total DNA and a specific set of primers 
at a specific concentration as described in Supplementary 
Table S2. For thermocycling conditions, see Supplemen- 
tary Table S3. PCR products were diluted with water in a 
ratio of 1:100. Fragment lengths were determined on an 
ABI 3730 Genetic Analyzer (Applied Biosystems) using the 
GeneScan-600 LIZ Size Standard (Applied Biosystems) 
and the software PEAK SCANNER 1.0 (Life Technologies, 
Carlsbad, CA). Twelve out of the 16 microsatellite loci were 
highly polymorphic and were used for further analyses 
(Supplementary Table S2). 


Data analysis 


Our mtDNA data were supplemented with 33 sequenc- 
es available from GenBank (Supplementary Table S1). 
These include sequences of the only three field-collected 
C. galbinifrons samples available (FMNH 255694, FMNH 
255695, FMNH 256544) from STUART & PARHAM (2004). 
Liu et al. (2019) did not make their sequence data public- 
ly available, but provided upon request 12 COI sequenc- 
es and three mitochondrial genomes. An inspection and 
comparison of the COI sequences generated by Liv et al. 
(2019) revealed multiple alignment gaps suggestive of poor 
sequence quality. Since gaps may indicate sequencing er- 
rors or nuclear mitochondrial insertions (numts), we did 
not use these sequences for our calculations. However, we 
extracted COI and ND4 sequences from the three mito- 
chondrial genomes provided by Liv et al. and incorporated 
these into our dataset (Supplementary Table S1). 

For phylogenetic calculations, the mtDNA sequences 
were concatenated (421 sequences), resulting in an align- 
ment of 1555 bp. The optimal partition scheme and the 
best-fitting nucleotide substitution model was inferred 
using PARTITIONFINDER 2 (LANFEAR et al. 2012, 2016, 
Supplementary Table $4) and the Bayesian Information 
Criterion (BIC). Bayesian trees were calculated with 
MRBAYES 3.2.7 (RONQUIST et al. 2012) and the imple- 
mented Markov chain Monte Carlo (MCMC) algorithm. 
Two independent runs (each with 4 chains) were per- 
formed with 10 million generations each, sampling eve- 
ry 500" generation, until the average standard deviation 
of split frequencies fell below 0.01. Results of the MCMC 
runs were summarized and the initial 25% of each run 
were discarded as burn-in. 


Maximum Likelihood trees were calculated using 
RAxML 8.2.10 (STAMATAKIS 2014) using the default GTR 
+ G model for all partitions. Five independent fast boot- 
strap searches were conducted to compute ML trees, start- 
ing from distinct randomized maximum parsimony trees. 
Subsequently, thorough bootstrap replicates were run un- 
til they converged with a cut-off of 1%. Convergences oc- 
curred after 400 replicates. TRACER 1.7.1 (RAMBAUT et al. 
2018) was used to test for parameter convergence of both 
runs using the Effective Sample Sizes (ESS) of parameters 
prior to generating consensus trees. Homologous sequenc- 
es for Mauremys reevesii (GenBank accession number 
APo19398) along with two Cuora trifasciata (GenBank ac- 
cession numbers KF574821, NCo22857) were used for root- 
ing the phylogenetic trees. 

Parsimony networks were constructed for both mtDNA 
fragments using TCS 1.2.3 (CLEMENT et al. 2000), with 
gaps treated as fifth character state and a connection lim- 
it of 100 steps. Since missing data compromises network 
calculation (Jory et al. 2007), individuals represented by 
short sequences were excluded, resulting in 377 sequences 
for COI and 361 for ND4. 

The 12 nuclear microsatellite loci were analyzed with 
an unsupervised Bayesian clustering approach as im- 
plemented in STRUCTURE 2.3.4 (Husisz et al. 2009, 
PRITCHARD et al. 2000) using the admixture model and 
correlated allele frequencies. STRUCTURE searches the 
data set for partitions which are, as far as possible, in Har- 
dy-Weinberg equilibrium and linkage equilibrium, and 
unsupervised analyses use only genetic data but no infor- 
mation about the collection sites or presumed taxonom- 
ic identity. For all STRUCTURE calculations, the upper 
bound was set arbitrarily to K = 10, and the most like- 
ly number of clusters (K) was determined using the AK 
method (EvANNO et al. 2005) as implemented in STRUC- 
TURE HARVESTER (EARL & VONHOLDT 2012). Calcula- 
tions were repeated 10 times for each K using a MCMC 
chain of 750,000 generations for each run, after a burn- 
in of 250,000 generations. Population structuring and in- 
dividual admixture were visualized using DISTRUCT 1.1 
(ROSENBERG 2004). 

To test the resolution power of STRUCTURE for infer- 
ring hybrid status and pure ancestry, simulations were run 
using HYBRIDLAB 1.0 (NIELSEN et al. 2006). Twenty sam- 
ples each of C. galbinifrons, C. bourreti, and C. picturata 
were selected as pure parental genotypes based on their 
pure microsatellite genotypes and concordant mitochon- 
drial identities. Using these data, 20 genotypes of each hy- 
brid class (F1, F2, and the two backcrosses) were modeled 
in HYBRIDLAB. Then, the obtained simulated hybrid data 
were subjected to analyses using STRUCTURE, together 
with the data of the 20 pure individuals of each species. 
According to our simulation analyses, a threshold of 93% 
(Supplementary Table S5) was used for identifying pure 
cluster membership, treating individuals with lower values 
as having mixed ancestries. 

In addition, we examined microsatellite data of the in- 
ferred clusters by calculating population genetic diversity 
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and divergence indices. Numbers and sizes of microsatel- 
lite alleles were compared by a frequency table produced 
in CONVERT 1.31 (GLAUBITZ 2004). For inferring locus- 
specific observed (H,) and expected heterozygosities (H,) 
and for performing a locus-by-locus analysis of molecular 
variance (AMOVA, 10,000 permutations), ARLEQUIN 
3.5.2.1 (EXCOFFIER & LISCHER 2010) was used. Locus-spe- 
cific excess or deficiency of heterozygotes as expressed by 
the inbreeding coefficient F,, (WEIR & COCKERHAM 1984) 
was calculated with FSTAT 2.9.3.2 (GOUDET 1995). The 
same software was also used for computing values for lo- 
cus-specific allelic richness and testing statistical signifi- 
cance of F, for each locus and across all loci using rand- 
omizations and Bonferroni correction (RICE 1989). Two 
Principal Component Analyses (PCAs) were run on mi- 
crosatellite genotypes to illustrate multidimensional re- 
lationships and to corroborate the STRUCTURE results 
without population-genetic presumptions following the 
recommendations by PUECHMAILLE (2016). PCAs were 
calculated using the ADEGENET package (JoMBART 
2008) for R 3.6.3. The first PCA included only individu- 
als identified as pure by STRUCTURE. The second PCA 
included all genotypes, i.e., from pure C. galbinifrons, 
C. bourreti, C. picturata, and from turtles with admixed 
ancestries. 


Results 


Our phylogenetic analyses of the concatenated mtDNA 
sequences revealed three major clades corresponding to 
C. galbinifrons, C. bourreti, and C. picturata (Fig. 2). Bayesi- 
an and ML approaches yielded almost identical and mostly 
well-supported topologies. In concordance with previous 
publications (Spinks et al. 2012, STUART & PARHAM 2004), 
C. bourreti was retrieved as the sister clade of C. galbini- 
frons and both together were sister to C. picturata. The 
clades containing sequences of C. galbinifrons and C. bour- 
reti showed some weak substructure. 

In a few cases, the phenotypic identity conflicted with 
the phylogenetic placement, suggesting either misidentifi- 
cation or mitochondrial introgression. Sequence data ex- 
tracted from one of the mt-genomes provided by Liu et 
al. (2019) and identified by them as C. bourreti clustered in 
the C. galbinifrons clade. Also, our sample MTD T 16819, 
morphologically identified as C. bourreti, was placed in the 
C. galbinifrons clade, in accordance with its microsatellite 
identity (Supplementary Table S1). MTD D 42945, mor- 
phologically identified as C. bourreti, corresponded mito- 
chondrially to C. picturata. Another live turtle, according 
to its morphology and microsatellite identity a C. galbini- 
frons (MTD T 16738), mitochondrially matched C. bour- 
reti, and two morphologically identified C. picturata 
(MTD D 42872, MTD T 18547) and a GenBank sequence 
(JF712890.1) of C. picturata clustered in the C. bourreti 
clade. The microsatellite genotypes of MTD D 42872 and 
MTD T 18547 support that these turtles are misidentified 
C. bourreti (Supplementary Table S1). 
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Eight samples from trade specimens putatively from 
Hainan Island were placed in the C. galbinifrons clade 
(voucher specimens MTD D 41957, 42575, 42868, 42875, 
42895, 42896, 42944, and 42946). Another alleged Hainan 
turtle occurred, in agreement with its microsatellite geno- 
type (Supplementary Table S1), in the C. bourreti clade 
(MTD D 42869). Our single wild-collected sample from 
Quang Binh Province, northern central Vietnam (MTD T 
357), was placed in the C. bourreti clade. 

Haplotype networks showed three highly distinct hap- 
lotype clusters corresponding to the mtDNA clades in phy- 
logenetic analyses. Each cluster was distinct by at least 16 
mutational steps (Fig. 3). 

For the microsatellite data, the AK method revealed 
K = 3 as the optimal number of clusters, with the three 
clusters matching the three currently recognized species 
and the distinct mtDNA clades (Fig. 4). The analyses re- 
vealed only seven specimens with less than 93% cluster 
membership probability, suggesting admixed ancestry. 
Three individuals with mixed ancestries from C. galbini- 
frons and C. bourreti were captive turtles from the Turtle 
Survival Center, USA (MTD T 16744, Q: 0.671, 0.318; MTD 
T 16763, Q: 0.820, 0.176) and the Allwetterzoo Münster, 
Germany (MTD T 18638, Q: 0.896, 0.100). All three had 
mitochondrial haplotypes of C. galbinifrons (Supplemen- 
tary Table S1). A single individual from the collection of 
the Smithsonian National Zoological Park, USA (MTD T 
16822, Q: 0.840, 0.147), was inferred to have a mixed an- 
cestry between C. bourreti and C. picturata. In both cas- 
es, taxa are involved that theoretically could have natu- 
ral contact and hybrid zones according to their distribu- 
tion. Surprisingly, also three admixed turtles between the 
northernmost species C. galbinifrons and the completely 
allopatric southernmost species C. picturata were found 
(MTD D 42875, Q: 0.838, 0.151, trade specimen, alleg- 
edly from Hainan; MTD D 42643, Q: 0.860, 0.118, trade 
specimen, allegedly from China; MTD T 16840, Q: 0.845, 
0.110, unknown origin, Buffalo Zoo). All three turtles 
yielded mitochondrial haplotypes of C. galbinifrons (Sup- 
plementary Table S1). Our single wild-collected sample 
from Quang Binh Province, Vietnam (MTD T 357), was 
inferred as pure C. bourreti, without admixture. This is in 
agreement with its mitochondrial identity (Supplementa- 
ry Table S1). 

The two PCAs (including and excluding admixed indi- 
viduals, respectively) using microsatellite genotypes con- 
firmed three distinct clusters (Fig. 5, left and right) that 
agree with the distinct species, the three mtDNA clades 
and STRUCTURE clusters (Figs. 4, 5). In the PCA includ- 
ing admixed genotypes, the admixed individuals were ei- 
ther placed between the clusters or were retrieved within 
the C. galbinifrons cluster (Fig. 5, right). 

To assess the genetic diversity within and the diver- 
gence among the three taxa using microsatellites, turtles 
of mixed ancestry were excluded from the dataset. The 
number of alleles per locus ranged from 6 to 39 (Supple- 
mentary Table S2). From a total of 216 alleles, 98 repre- 
sented private alleles (Table 1, see also for further genetic 
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Figure 2. Bayesian consensus tree based on sequences for 415 Cuora samples using a concatenated mitochondrial alignment (1555 bp; 
741 bp: COI, 814 bp: ND4 with flanking tRNA genes). Terminal clades of all taxa were collapsed, outgroups (Mauremys reevesii, 
C. trifasciata) and posterior probabilities of terminal clades not shown for clarity. Posterior probabilities exceeding 0.95 are displayed 
as color-coded nodes as explained in the inset. Bars display species identity for each sample according to phenotype, microsatellite 
analyses, and mtDNA (horizontal color sections; white sections represent missing data). Insets from top to bottom: C. galbinifrons, 
C. bourreti, and C. picturata. Photos by E. LEHR. 
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Figure 3. Parsimony networks for individual mtDNA fragments of Cuora samples. Symbol size corresponds to haplotype frequency; 
lines connecting haplotypes represent one mutation step. Small white circles are missing haplotypes. Colors of circles correspond to 
those of mitochondrial clades (Fig. 2). The haplotypes containing the respective mtDNA fragment of the misidentified C. bourreti 
mt-genome from Liu et al. (2019) are marked by asterisks. 
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Figure 4. Genotypic structuring of 333 box turtles of the genus Cuora for K = 3 using 12 microsatellite loci. Shown is the STRUCTURE 
run with the best probability value. Within each cluster, an individual is represented by a vertical segment that reflects its ancestry. 
Mixed ancestries are indicated by differently colored sections corresponding to inferred genetic percentages of the respective cluster. 
Individuals with admixed ancestry are marked with an asterisk. For each sample, its identity according to mitochondrial lineage is 
shown above the STRUCTURE diagram (see also Supplementary Table S1). 
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Table 1. Genetic diversity indices and population genetic parameters of the pure representatives of the three Cuora species using 12 
microsatellite loci: n = sample size; n a = total number of alleles per cluster; n a = Mean number of alleles per locus; n, = number of 
private alleles; AR, allelic richness; F inbreeding coefficient; * statistically significant; Ho average observed heterozygosity; H, aver- 
age expected heterozygosity; HWE, number of loci in Hardy-Weinberg equilibrium. 


n n, ny n, AR Ei; H, H, HWE 
Cuora bourreti 158 165 13.5 52 10.7 0.250* 0.534 0.712 1 
Cuora galbinifrons 126 148 12.33 37 9.81 0.194* 0.606 0.751 2 
Cuora picturata 42 49 4.08 9 4.07 0.221* 0.254 0.326 5 
Table 2. Pairwise fixation indices (F,. values) for the STRUCTURE Discussion 


clusters (K = 3) of Indochinese box turtles. Turtles with mixed 
ancestries were excluded, resulting in a data set of 326 individu- 
als. All F „ values are significantly different from zero 


Species C. bourreti C. galbinifrons C. picturata 
C. bourreti - 

C. galbinifrons 0.189 - 

C. picturata 0.343 0.417 - 


diversity indices and population genetic parameters). 
Cuora bourreti had the highest number of private alleles 
and the highest diversity indices. The highest F,, value 
was found between C. galbinifrons and C. picturata (F, = 
0.417, Table 2). The AMOVA indicated that 27.12% of the 
observed global variation occurred among pure repre- 
sentatives of the individual species and 72.88% within 
them. 


Analyses of two mtDNA fragments (together 1555 bp) and 
12 microsatellite loci yielded for the Cuora galbinifrons 
complex largely concordant results supporting, in accord- 
ance with morphology, three distinct taxa matching the 
three currently recognized species. A few admixed indi- 
viduals and turtles with mismatched mitochondrial iden- 
tity indicate that hybridization does occasionally occur 
in captivity. Our only wild-collected sample from Quang 
Binh Province, Vietnam, a putative natural hybrid zone of 
C. galbinifrons and C. bourreti (FRITZ et al. 2002), was in- 
ferred as C. bourreti without admixture. This is in accord- 
ance with the morphological identification of the speci- 
men. Based on shell shape and color pattern, LEHR et al. 
(1998) concluded that the Hainan population of what was 
then C. galbinifrons sensu lato is morphologically interme- 
diate between C. galbinifrons and C. bourreti and suggested 
that this population represents a relic of a former intergra- 
dation zone that connected the two taxa when Hainan Is- 
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Figure 5. Principal Component Analysis (PCA) using microsatellite genotypes. Left: PC 1-2 for pure representatives of C. bourreti, 
C. galbinifrons, and C. picturata. The first, second, and third principal components explain 5.63%, 4.03%, and 2.19% of variation, re- 
spectively. Right: PC 1-2 for pure representatives of the three species plus admixed individuals. The first, second, and third principal 
components explain 5.54%, 3.97%, and 2.16% of the variation, respectively. Oval outlines represent 95% confidence intervals; non- 
overlapping lines denote significantly different clusters. Dots correspond to individual genotypes; colors according to STRUCTURE 


clusters. 
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land was part of the Asian mainland during Pleistocene low 
sea level stands. This hypothesis was further supported by 
osteological evidence from three skeletons from Hainan, 
with one specimen showing the character states of C. bour- 
reti(MTD D 40849), the second of C. galbinifrons (MTD D 
42895), and the third with an intermediate character state 
(MTD D 42896, Fritz et al. 2006). Mitochondrial DNA 
sequences of two of these specimens (MTD D 42895 and 
MTD D 42896) placed them in the C. galbinifrons clade, 
but microsatellites could not be amplified. Five trade speci- 
mens allegedly from Hainan were genotypically identified 
as pure C. galbinifrons (MTD D 41957, 42946) or C. bour- 
reti (MTD D 42869), without any evidence of admixture. 
Another pet trade specimen allegedly from Hainan (MTD 
D 42875) had an admixed, but completely unexpected, an- 
cestry and turned out as a hybrid between the northern- 
most species C. galbinifrons and the southernmost species 
C. picturata. Since these two taxa are completely allopat- 
ric, natural hybridization is impossible, suggesting that this 
turtle was captive-bred (e.g., in a turtle farm), where delib- 
erate or accidental cross-breeding occurred. The unexpect- 
ed absence of any signal of admixture between C. galbini- 
frons and C. bourreti among our few trade samples sup- 
posedly from Hainan Island does not necessarily provide 
evidence for the absence of hybridization. It could either 
be the result of faulty or intentionally fabricated localities, 
or caused by genetic backcrossing. Evidence of historical 
gene flow between distinct species is impossible to detect 
using microsatellites alone (BEEBEE & ROWE 2008, SANZ et 
al. 2009, VAMBERGER et al. 2017) because a few generations 
of backcrossing obscure genetic admixture. Thus, more re- 
search is needed including samples with reliable collection 
sites from the putative contact zones of C. galbinifrons and 
C. bourreti and additional marker systems (nuclear loci, 
SNPs, or whole genome sequencing) to either verify or re- 
fute the existence of intergradation or ancient hybridiza- 
tion, both on Hainan and in Vietnam. 

With respect to the identified captive turtles with mixed 
ancestries (MTD T 16744, 16763 from the Turtle Survival 
Center, USA, MTD T 18638 from the Allwetterzoo Miin- 
ster, Germany, and MTD T 16822 from the Smithsonian 
National Zoological Park, USA), we recommend that these 
individuals should be excluded from breeding to prevent 
further genetic pollution. 
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